Evolution of Avian Influenza Virus (H3) with Spillover into Humans, China

The continuous evolution of avian influenza viruses (AIVs) of subtype H3 in China and the emergence of human infection with AIV subtype H3N8 highlight their threat to public health. Through surveillance in poultry-associated environments during 2009–2022, we isolated and sequenced 188 H3 AIVs across China. Performing large-scale sequence analysis with publicly available data, we identified 4 sublineages of H3 AIVs established in domestic ducks in China via multiple introductions from wild birds from Eurasia. Using full-genome analysis, we identified 126 distinct genotypes, of which the H3N2 G23 genotype predominated recently. H3N8 G25 viruses, which spilled over from birds to humans, might have been generated by reassortment between H3N2 G23, wild bird H3N8, and poultry H9N2 before February 2021. Mammal-adapted and drug-resistance substitutions occasionally occurred in H3 AIVs. Ongoing surveillance for H3 AIVs and risk assessment are imperative for potential pandemic preparedness.


Virus isolation and identification
Viruses were isolated in 9 to 11-day specific-pathogen-free (SPF) embryonated chicken eggs. After incubation for 48-72 hours at 37°C, the presence of the virus in the allantoic fluids of eggs was identified by a hemagglutination test using 1% turkey red blood cells (TRBC).

RNA extraction and genome sequencing
Virus RNA was extracted from the isolated viruses using the MagMAX CORE Nucleic Acid Purification Kit (Thermo Fisher, Waltham, MA). The extracted RNA was subjected to reverse transcription and amplification using the SuperScript® III One-Step RT-PCR system (Thermo Fisher, Waltham, MA) according to the described method (1).Whole genome sequencing of influenza A virus was implemented on the automatic Applied Biosystems 3730xl DNA Analyzer (Life Technologies, USA) or MiSeq highthroughput sequencing platform (Illumina, Inc., San Diego, CA, USA). The raw data from MiSeq platform were paired reads with length of 150 bp. Low-quality reads were trimmed, and the filtered reads were sampled and de novo-assembled using Velvet (version 1.2.10) (2) and Newbler (version 2.5). Contigs were blasted against a database containing all influenza A virus nucleotide sequences collected from the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/genomes/FLU) and the Global Initiative on Sharing All Influenza Data (GISAID) (http://www.gisaid.org). Sequences with the highest similarity were selected as references for mapping of reads using Bowtie 2 (version 2.1.0) (3). The influenza A virus genome sequences were obtained by extracting the consensus sequences from the mapping results, with a coverage depth of at least 30 times at each site on the eight segments.

Sequence collection and alignment
All the sequences of 188 environmental H3 isolates were initially obtained in our surveillance (Appendix Table 1) and 32 of them have been published in our previous study (4). H3, N2, N3, N6, N8 and internal gene sequences of avian-origin viruses were downloaded from the GISAID EpiFlu database as of June 25, 2022. Sequences of two human H3N8 viruses isolated in Henan and Hunan provinces were also included. The resulting sequences of each segment were aligned using MAFFT software v7.222 (5) and manually adjusted to correct frameshift errors in MEGA v7.0 (6). The coding region of each segment was retained and the signal peptide was removed from the HA segment.
Sequences with ≥97% of the length of coding region and ≤3 degenerate bases were included for molecular characterization and phylogenetic analyses. To comprehensively elaborate the evolution of H3 AIVs in China, we extended the selection of Chinese sequences. The sequences collected in China were retained with less than 4 degenerate bases and no less than 90% of the length of coding region of the segment. For the HA gene, sequences with complete HA1 domain were also included.

Phylogenetic analyses
Maximum likelihood (ML) phylogenies of all segments were reconstructed using the FastTree software v2.1.11 (7) under the GTRGAMMA model with 1,000 bootstrap replicates. All H3-HA gene sequences were included in the phylogenetic analyses. The datasets of NA genes and internal genes were reduced using cd-hit to improve the computational efficiency. All H3 virus strains isolated from avian host and environment samples in China (Figure 1; Appendix Figure 15) as well as two human H3N8 virus strains were retained for reconstructing evolutionary relationships. We also retrieved the sequences that shared the highest similarity with Chinese H3 AIVs by BLAST. The resulting trees were classified into divergent lineages or sublineages according to the topology of phylogenetic trees with bootstrap support values ≥ 70%. Clusters from Eurasian wild bird gene pool were then manually merged, if necessary, based on the bootstrap support values and reported classification (8). The ggtree package in R(9) and FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/) were used for visualization and annotation.

Genotypic analysis
There is remarkably frequent occurrence of reassortment in AIV, that formed the wild bird gene pool. Studies on large number of sequences provided little evidence for the elevated fitness of specific gene combinations in AIV isolates from wild birds (10).

Molecular dating
To estimate the time to the most recent common ancestor (tMRCA) of human H3N8 viruses, smaller datasets for eight gene segments containing human H3N8 virus strains were selected to run time-measured Bayesian Markov chain Monte Carlo (MCMC) analysis implemented in BEAST v1. 10.4 (11). Root-to-tip genetic divergence against sampling dates was analyzed using TempEst v1.5.3 (12) to investigate the temporal signal of our datasets. The best-fitting nucleotide substitution models (GTR+G for the MP gene, and GTR+G+I for other gene segments) were identified according to the Corrected Akaike Information Criterion in MEGA v7.0 (6). A relaxed clock model with uncorrelated lognormal distribution was used. A Bayesian skyride model with time-aware smoothing was used (13). The default distribution was used for prior CP1 + 2, CP3, and ucld.mean. Bayesian MCMC analyses were run for 50 million steps, with sampling every 5,000 steps. Multiple independent MCMC trajectories were computed and combined. The first 10% states were removed. Final effective sample size as assessed in Tracer v1.7.2.

Evolution of other H3 sublineages
Chinese H3 AIVs were also found scattered in sublineages Asia, Europe-Asia, worldwide-1, and worldwide-2 ( Figure 2; Appendix Figure 1). Both poultry and wild bird isolates were found in each sublineage.

Reassortment with N2 genes
The N2 genes of AIVs in the Eurasian lineage could be further classified into 4 major sublineages: Eurasian-1, Eurasian-2, waterfowl H6N2, and poultry H9N2 sublineages (Appendix Figure 2A) The NA genes of the H3N3 AIVs in China were all derived from the Eurasian lineage (Appendix Figure 2B). Notably, one H3N3 strain sequenced in our study, A/Environment/Hunan/13561/2020, showed a close relationship with the NA genes of human-origin H10N3 virus and H10N3 viruses circulating in chicken during 2019-2021.
Another H3N3 strain sampled in 2020 was grouped with duck H7N3 viruses circulating in China during 2010-2018.

Reassortment with N6 genes
The NA genes of the H3N6 AIVs in China were scattered in the Eurasian lineage (Appendix Figure 2C). H3N6 viruses isolated from poultry and wild bird in Jiangxi, Guangdong, Guangxi, and Hunan provinces during 2014-2015 were clustered with highly pathogenic AIV (HPAIV) H5N6.

Reassortment with internal genes
To explore the pattern of reassortment of H3 AIVs in China, we reconstructed the phylogenies for all six internal genes (Appendix Figure 3). The internal genes of all the H3 AIVs sequenced through our surveillance belonged to the Eurasian lineage, except for the M gene of A/Environment/Sichuan/32281/2016(H3N2), as previously described (4).
A large proportion of H3 AIVs in China fell into the Eurasian wild bird reservoir (Appendix Figure 3). Many H3 AIVs contained internal genes (especially PA, NP and, M genes) from the ZJ-5 sublineage, of the wild bird viral gene pool, which consisted mainly of viruses isolated from domestic waterfowl in China. Very few H3 AIVs derived their  Figure 5A). Of the eight common genotypes, G6, G23, and G25 have been circulating for many years and spread to more regions (Appendix Figure 4A and 5A). Notably, the G23, which had been detected as early as 2014, continued circulating and was monitored in 2022. The H3N2 G23 viruses contained HA genes from the China-1 sublineage, NA genes from the Eurasian lineage, M genes from the ZJ-5 sublineage, and other 5 internal genes from the Eurasian wild bird gene pool (Appendix Table 4).